Pedestrian Foot Traffic Business Case

Authored by: Ngoc Dung Hyunh

Duration: 40 mins 

Level: Advanced

Pre-requisite Skills: Python

Scenario

As a business owner, I want to know how much pedestrian foot traffic occurs at a potential new business location.

I would like to be able to compare pedestrian traffic through the day, week and month.

I would like to see visualisations that can help me decide where to start my new venture.

What this Use Case will teach you

At the end of this use case you will:

A brief introduction to the datasets used

The City of Melbourne provides comprehensive pedestrian traffic data, as well as sensor location data. In this use case we will utilise the Pedestrian Counting System - Past Hour (counts per minute) dataset.

This dataset contains minute-by-minute directional pedestrian counts for the last hour from pedestrian sensor devices located across the city. The data is updated every 15 minutes and can be used to determine variations in pedestrian activity throughout the day.

We also utilise the City of Melbourne's Sensor Locations dataset. This data will be used to extract the location of these sensors which helps with visualisation.

Accessing and Loading data

We aim to make a decision that will be informed by insights based on the latest data.

In order to get this data, we can create a function to extract, transform and load (ETL) pedestrian traffic data every 15 minutes.
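The 15-minute refresh can be sketched as a simple loop around whatever ETL function we write. This is a minimal sketch, not the original implementation: the name run_periodically and its parameters are hypothetical, and the sleep function is injectable only so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def run_periodically(job: Callable[[], None],
                     interval_seconds: float = 15 * 60,
                     max_runs: Optional[int] = None,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Call job(), wait interval_seconds, and repeat.

    Stops after max_runs calls, or runs forever when max_runs is None.
    Returns the number of times job() was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Only wait if another run is still to come.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Hypothetical usage: refresh the data four times, 15 minutes apart.
# run_periodically(my_etl_function, max_runs=4)
```

For long-running deployments a system scheduler such as cron may be a better fit than a Python loop.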

First, we will do ETL for Sensor Locations and Pedestrian Counting System - Past Hour (counts per minute).

We will then merge these datasets.

Note: We use the package sodapy to extract data from Melbourne Open Data directly. This package is a Python client for the Socrata Open Data API. To extract the data from Melbourne Open Data, you must have a dataset id. The id is the final segment of the dataset's URL, for example h57g-5234 in the Sensor Locations URL listed under References.

| Column | Description | Type |
| --- | --- | --- |
| Sensor ID | Unique reading ID | Categorical |
| Count | Hourly sum of Pedestrians | Numerical |
| Sensor Description | A description of where the sensor is located | Categorical |
| Lat | Latitude of each sensor | Numerical |
| Lon | Longitude of each sensor | Numerical |
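The extract and merge steps might look like the sketch below. It uses sodapy's Socrata client and its get method; everything else is an assumption for illustration: the helper names, the field names (sensor_id, total_of_directions, sensor_description, latitude, longitude) and the availability of the Socrata endpoint on data.melbourne.vic.gov.au. Check the field names against the records you actually receive. The id of the counts dataset is not given in this document, so it is left as a parameter.

```python
from typing import Dict, List

DOMAIN = "data.melbourne.vic.gov.au"
SENSOR_LOCATIONS_ID = "h57g-5234"  # taken from the Sensor Locations URL


def fetch_dataset(dataset_id: str, limit: int = 5000) -> List[dict]:
    """Download one dataset as a list of dicts (needs `pip install sodapy`)."""
    from sodapy import Socrata  # lazy import: the merge helper works without it

    client = Socrata(DOMAIN, None)  # None means no app token is supplied
    try:
        return client.get(dataset_id, limit=limit)
    finally:
        client.close()


def merge_counts_with_locations(counts: List[dict],
                                locations: List[dict]) -> List[dict]:
    """Sum per-minute counts per sensor and attach each sensor's location.

    Field names are assumptions. Values are converted explicitly because
    they may arrive as strings.
    """
    by_id: Dict[str, dict] = {str(loc["sensor_id"]): loc for loc in locations}
    totals: Dict[str, int] = {}
    for row in counts:
        sensor_id = str(row["sensor_id"])
        totals[sensor_id] = totals.get(sensor_id, 0) + int(row["total_of_directions"])

    merged = []
    for sensor_id, total in totals.items():
        loc = by_id.get(sensor_id)
        if loc is None:
            continue  # skip sensors that have no known location
        merged.append({
            "sensor_id": sensor_id,
            "count": total,
            "sensor_description": loc["sensor_description"],
            "lat": float(loc["latitude"]),
            "lon": float(loc["longitude"]),
        })
    return merged


# Hypothetical usage (requires network access and the counts dataset id):
# locations = fetch_dataset(SENSOR_LOCATIONS_ID)
# counts = fetch_dataset(counts_dataset_id)
# sensors = merge_counts_with_locations(counts, locations)
```

The merged records carry the same five columns as the table above, which keeps the later map and chart steps independent of how the data was fetched.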

Using Selenium to Crawl Data

To generate insights that inform our business scenario, we can compare the data from two different sensor locations.

This means we can compare pedestrian foot traffic in two separate locations at different times of the current day, as well as over a 4-week period.

However, in order to get this data, we need to crawl the data from http://www.pedestrian.melbourne.vic.gov.au.

Note: Because we cannot use sodapy to crawl the data from this website, we use another package, Selenium, to extract the data.
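A minimal Selenium sketch for loading the page is shown below. It assumes Selenium 4 and a local Chrome installation. The function names are hypothetical, and the sketch only returns the rendered HTML: how the counts are then located in the page depends on the site's structure, which is not described here.

```python
from typing import Callable

PEDESTRIAN_SITE = "http://www.pedestrian.melbourne.vic.gov.au"


def make_headless_chrome():
    """Create a headless Chrome driver (needs selenium and a local Chrome)."""
    from selenium import webdriver  # lazy import so the module loads without it
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # run without opening a window
    return webdriver.Chrome(options=options)


def fetch_rendered_html(url: str,
                        driver_factory: Callable = make_headless_chrome,
                        timeout_seconds: float = 30) -> str:
    """Open url in a browser and return the HTML after the page has loaded."""
    driver = driver_factory()
    try:
        driver.set_page_load_timeout(timeout_seconds)  # give up on slow loads
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


# Hypothetical usage:
# html = fetch_rendered_html(PEDESTRIAN_SITE)
```

Passing the driver in through a factory keeps the browser choice in one place and lets the function be tried out with a stand-in driver.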

Basic Map Visualisation

Visualising data provides an easy way for us to detect patterns within the data. Using the Python library folium, we can create a live pedestrian traffic map of the City of Melbourne.

This map will represent all of the data of the pedestrian sensor locations within the City of Melbourne.

This map is updated every 15 minutes for up-to-date pedestrian traffic information.

As we can see, the majority of the pedestrian traffic takes place around Swanston Street in the heart of the city.
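A map like this could be built with folium roughly as follows. This is a sketch under assumptions: each sensor is a dict with the keys lat, lon, count and sensor_description, the map centre is an approximate coordinate for the Melbourne CBD, and the function names and marker sizes are arbitrary choices.

```python
from typing import List


def marker_radius(count: float, max_count: float,
                  min_radius: float = 3.0, max_radius: float = 20.0) -> float:
    """Scale a count linearly into a marker radius in pixels."""
    if max_count <= 0:
        return min_radius
    share = min(max(count / max_count, 0.0), 1.0)  # clamp to the range 0..1
    return min_radius + share * (max_radius - min_radius)


def build_traffic_map(sensors: List[dict]):
    """Return a folium map with one circle per sensor, sized by its count."""
    import folium  # lazy import so marker_radius can be used without folium

    centre = [-37.8136, 144.9631]  # approximate Melbourne CBD (assumption)
    traffic_map = folium.Map(location=centre, zoom_start=14)
    max_count = max((s["count"] for s in sensors), default=0)
    for s in sensors:
        folium.CircleMarker(
            location=[s["lat"], s["lon"]],
            radius=marker_radius(s["count"], max_count),
            popup=f'{s["sensor_description"]}: {s["count"]}',
            fill=True,
        ).add_to(traffic_map)
    return traffic_map


# Hypothetical usage:
# build_traffic_map(sensors).save("pedestrian_map.html")
```

Rebuilding and saving the map after each data refresh is one simple way to keep it current.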

Basic Trend Visualisation

Having developed a basic visualisation of our data, we can now produce a visualisation that shows us potential trends in pedestrian foot traffic.

We can create a line chart that consists of two lines of different sensor locations at different times of day.

Based on this visualisation, we can compare the rush hour and off-peak periods of the two sensors.
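The comparison can be sketched by aggregating each sensor's readings by hour of day, finding the peak hour, and drawing both series on one chart. The input shape (pairs of ISO timestamp and count), the function names and the use of matplotlib are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def hourly_totals(readings: Iterable[Tuple[str, int]]) -> Dict[int, int]:
    """Sum (iso_timestamp, count) readings into totals per hour of day."""
    totals: Dict[int, int] = defaultdict(int)
    for timestamp, count in readings:
        totals[datetime.fromisoformat(timestamp).hour] += int(count)
    return dict(sorted(totals.items()))


def peak_hour(totals: Dict[int, int]) -> int:
    """Return the hour of day with the highest total."""
    return max(totals, key=totals.get)


def plot_two_sensors(name_a: str, totals_a: Dict[int, int],
                     name_b: str, totals_b: Dict[int, int]):
    """Draw one line per sensor, with hour of day on the x-axis."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    fig, ax = plt.subplots()
    ax.plot(list(totals_a), list(totals_a.values()), label=name_a)
    ax.plot(list(totals_b), list(totals_b.values()), label=name_b)
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Pedestrian count")
    ax.legend()
    return fig
```

Comparing peak_hour for each sensor gives a quick numeric check on what the chart shows.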

This information can help inform our business owner about a potential new location.

Based on this visualisation, if we want to open a business that caters to the lunchtime crowd, then we may wish to locate our business near Bourke Street Mall.

However, if we want to open a business that caters to the dinner crowd, perhaps the area in and around Melbourne Central provides better incentives.

This goes to show just how dynamic life is in the City of Melbourne.

Geographic Filter

Sometimes, we require more specific information that may not be obtainable from the sensor information alone. In this example, we take specific addresses as data input instead of the Sensor location to provide a more meaningful and insightful result.

We can create a Geographic Filter that will take a specific address such as 100 Flinders Street and create a live map and a live line chart.

The problem is that an address may have multiple pedestrian sensors near it. Therefore, we introduce another variable called Radius. This variable is used to select all sensors that lie within the given radius of the target address.

We then take the sum of the counts from these sensors to estimate the pedestrian traffic at the address.

To produce this filter, we have 7 steps:

  1. Take addresses
  2. Use Google API to get the Latitude and Longitude of these addresses
  3. Find the distance between these addresses and the sensors
  4. Filter the sensors that are near these addresses
  5. Create a live chart
  6. Filter daily and monthly data
  7. Draw a line chart
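Steps 2 to 4 above can be sketched as follows. The distance is computed with the haversine formula, which is one common choice and not necessarily the method used originally. The geocoding helper assumes a client object with a geocode(address) method that returns results in the Google Geocoding response shape, such as googlemaps.Client. The function names and the sensor dict keys (lat, lon, count) are assumptions.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def geocode_address(address: str, client) -> Tuple[float, float]:
    """Return (lat, lon) for an address using a geocoding client (assumed API)."""
    results = client.geocode(address)
    if not results:
        raise ValueError(f"No geocoding result for {address!r}")
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]


def sensors_within(lat: float, lon: float, sensors: List[dict],
                   radius_m: float) -> List[dict]:
    """Keep only the sensors that lie within radius_m metres of the point."""
    return [s for s in sensors
            if haversine_m(lat, lon, s["lat"], s["lon"]) <= radius_m]


def total_count(sensors: List[dict]) -> int:
    """Sum the counts of the selected sensors."""
    return sum(s["count"] for s in sensors)


# Hypothetical usage:
# lat, lon = geocode_address("100 Flinders Street, Melbourne", client)
# nearby = sensors_within(lat, lon, sensors, radius_m=300)
# print(total_count(nearby))
```

Over distances of a few hundred metres a simpler flat-earth approximation would also do, but the haversine formula avoids having to reason about that.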

We can also visualise pedestrian foot traffic without an address input. In that case, the function returns a live pedestrian traffic map of Melbourne and a line chart showing the three busiest pedestrian sensors.
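Picking the three busiest sensors is a small ranking step. A minimal sketch, assuming each sensor is a dict with a count key (the function name is hypothetical):

```python
import heapq
from typing import List


def busiest_sensors(sensors: List[dict], n: int = 3) -> List[dict]:
    """Return the n sensors with the highest count, busiest first."""
    return heapq.nlargest(n, sensors, key=lambda s: s["count"])
```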

According to the above figures, the corner of Flinders and Swanston Streets, Southbank, and the intersection of Spencer St and Collins St are the three busiest areas in Melbourne. In addition, there are more people on the street today compared to last month's average. In conclusion, a business that wants to open a new restaurant in Melbourne should locate it near these areas.

More realistically, we assume that the business wants to find the most suitable place for its restaurant around three desired areas: Flinders Street Station, Southern Cross Station, and Melbourne Central Station. We now analyse pedestrian traffic within 300 metres of each of these three locations. We will show a line chart presenting the total pedestrian counts of all sensors within 300 metres of these addresses. We also use a live map to show the pedestrian counts of all these sensors.
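To compare locations on today's traffic and on a monthly average, the daily totals for each location can be summarised as below. This is a sketch under assumptions: the input is a mapping from date to total count for one location, "one month" is taken to be the previous 30 days, and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict


def summarise_counts(daily_totals: Dict[date, int], today: date,
                     window_days: int = 30) -> Dict[str, float]:
    """Return today's total and the average daily total over the previous window."""
    previous_days = (today - timedelta(days=k) for k in range(1, window_days + 1))
    past = [daily_totals[d] for d in previous_days if d in daily_totals]
    average = sum(past) / len(past) if past else 0.0
    return {"today": daily_totals.get(today, 0), "month_average": average}


def best_location(summaries: Dict[str, Dict[str, float]],
                  metric: str = "today") -> str:
    """Return the location name with the highest value for the chosen metric."""
    return max(summaries, key=lambda name: summaries[name][metric])
```

Days with no data are left out of the average rather than counted as zero, which avoids understating traffic when a sensor was offline.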

From the above chart, the area around Flinders Street Station has the highest pedestrian count right now, today, and as a one-month average. Therefore, the area within 300 metres of Flinders Street Station is the best area in which to locate the restaurant. To be more specific, we use the live map to choose the best location. We can see that the sensor at Crumpler Flinders Lane records the highest pedestrian count around Flinders Street Station. Therefore, Crumpler Flinders Lane is the best spot for the restaurant.

Congratulations!

You've successfully used Melbourne Open Data to visualise pedestrian traffic in and around the City of Melbourne!

References

1: https://data.melbourne.vic.gov.au/Transport/Pedestrian-Counting-System-Monthly-counts-per-hour/b2ak-trbp

2: https://data.melbourne.vic.gov.au/Transport/Pedestrian-Counting-System-Sensor-Locations/h57g-5234

3: https://pypi.org/project/sodapy/#:~:text=sodapy%20is%20a%20python%20client%20for%20the%20Socrata%20Open%20Data%20API.

4: http://python-visualization.github.io/folium/

5: https://pypi.org/project/selenium/